76 research outputs found

    The Silently Shifting Semicolon

    Get PDF
    Memory consistency models for modern concurrent languages have largely been designed from a system-centric point of view that protects, at all costs, optimizations that were originally designed for sequential programs. The result is a situation that, when viewed from a programmer\u27s standpoint, borders on absurd. We illustrate this unfortunate situation with a brief fable and then examine the opportunities to right our path

    Retro: Targeted Resource Management in Multi-tenant Distributed Systems

    Get PDF
    Abstract In distributed systems shared by multiple tenants, effective resource management is an important pre-requisite to providing quality of service guarantees. Many systems deployed today lack performance isolation and experience contention, slowdown, and even outages caused by aggressive workloads or by improperly throttled maintenance tasks such as data replication. In this work we present Retro, a resource management framework for shared distributed systems. Retro monitors per-tenant resource usage both within and across distributed systems, and exposes this information to centralized resource management policies through a high-level API. A policy can shape the resources consumed by a tenant using Retro's control points, which enforce sharing and ratelimiting decisions. We demonstrate Retro through three policies providing bottleneck resource fairness, dominant resource fairness, and latency guarantees to high-priority tenants, and evaluate the system across five distributed systems: HBase, Yarn, MapReduce, HDFS, and Zookeeper. Our evaluation shows that Retro has low overhead, and achieves the policies' goals, accurately detecting contended resources, throttling tenants responsible for slowdown and overload, and fairly distributing the remaining cluster capacity

    Model checking system software with CMC

    Full text link
    Complex systems have errors that involve mishandled cor-ner cases in intricate sequences of events. Conventional test-ing techniques usually miss these errors. In recent years, formal verification techniques such as [5] have gained pop-ularity in checking a property in all possible behaviors of a system. However, such techniques involve generating an ab-stract model of the system. Such an abstraction process is unreliable, difficult and miss a lot of implementation errors. CMC is a framework for model checking a broad class of software written in the C programming language. CMC runs the software implementation directly without deriving an ab-stract model of the code. We used CMC to model check an existing implementation of AODV (Ad Hoc On Demand Dis-tance Vector) routing protocol and found a total of bugs in two implementations [7],[6] of the protocol. One of them is a bug in the actual specification of the AODV protocol [3]. We also used CMC on the IP Fragmentation module in the Linux TCP/IPv4 stack and verified its correctness for up to fragments per packet.

    SC-Haskell: Sequential Consistency in Languages That Minimize Mutable Shared Heap

    Get PDF
    A core, but often neglected, aspect of a programming language design is its memory (consistency) model. Sequential consistency~(SC) is the most intuitive memory model for programmers as it guarantees sequential composition of instructions and provides a simple abstraction of shared memory as a single global store with atomic read and writes. Unfortunately, SC is widely considered to be impractical due to its associated performance overheads. Perhaps contrary to popular opinion, this paper demonstrates that SC is achievable with acceptable performance overheads for mainstream languages that minimize mutable shared heap. In particular, we modify the Glasgow Haskell Compiler to insert fences on all writes to shared mutable memory accessed in nonfunctional parts of the program. For a benchmark suite containing 1,279 programs, SC adds a geomean overhead of less than 0.4 on an x86 machine. The efficiency of SC arises primarily due to the isolation provided by the Haskell type system between purely functional and thread-local imperative computations on the one hand, and imperative computations on the global heap on the other. We show how to use new programming idioms to further reduce the SC overhead; these create a virtuous cycle of less overhead and even stronger semantic guarantees (static data-race freedom)

    Sequentializing Parameterized Programs

    Full text link
    We exhibit assertion-preserving (reachability preserving) transformations from parameterized concurrent shared-memory programs, under a k-round scheduling of processes, to sequential programs. The salient feature of the sequential program is that it tracks the local variables of only one thread at any point, and uses only O(k) copies of shared variables (it does not use extra counters, not even one counter to keep track of the number of threads). Sequentialization is achieved using the concept of a linear interface that captures the effect an unbounded block of processes have on the shared state in a k-round schedule. Our transformation utilizes linear interfaces to sequentialize the program, and to ensure the sequential program explores only reachable states and preserves local invariants.Comment: In Proceedings FIT 2012, arXiv:1207.348

    Interoperability-Guided Testing of QUIC Implementations using Symbolic Execution

    Full text link
    The main reason for the standardization of network protocols, like QUIC, is to ensure interoperability between implementations, which poses a challenging task. Manual tests are currently used to test the different existing implementations for interoperability, but given the complex nature of network protocols, it is hard to cover all possible edge cases. State-of-the-art automated software testing techniques, such as Symbolic Execution (SymEx), have proven themselves capable of analyzing complex real-world software and finding hard to detect bugs. We present a SymEx-based method for finding interoperability issues in QUIC implementations, and explore its merit in a case study that analyzes the interoperability of picoquic and QUANT. We find that, while SymEx is able to analyze deep interactions between different implementations and uncovers several bugs, in order to enable efficient interoperability testing, implementations need to provide additional information about their current protocol state.Comment: 6 page

    Finding Race Conditions in Erlang with Quick Check and PULSE

    Get PDF
    We address the problem of testing and debugging concurrent, distributed Erlang applications. In concurrent programs, race conditions are a common class of bugs and are very hard to find in practice. Traditional unit testing is normally unable to help finding all race conditions, because their occurrence depends so much on timing. Therefore, race conditions are often found during system testing, where due to the vast amount of code under test, it is often hard to diagnose the error resulting from race conditions. We present three tools (Quick Check, PULSE, and a visualizer) that in combination can be used to test and debug concurrent programs in unit testing with a much better possibility of detecting race conditions. We evaluate our method on an industrial concurrent case study and illustrate how we find and analyze the race conditions

    Fault-Aware Neural Code Rankers

    Full text link
    Large language models (LLMs) have demonstrated an impressive ability to generate code for various programming tasks. In many instances, LLMs can generate a correct program for a task when given numerous trials. Consequently, a recent trend is to do large scale sampling of programs using a model and then filtering/ranking the programs based on the program execution on a small number of known unit tests to select one candidate solution. However, these approaches assume that the unit tests are given and assume the ability to safely execute the generated programs (which can do arbitrary dangerous operations such as file manipulations). Both of the above assumptions are impractical in real-world software development. In this paper, we propose CodeRanker, a neural ranker that can predict the correctness of a sampled program without executing it. Our CodeRanker is fault-aware i.e., it is trained to predict different kinds of execution information such as predicting the exact compile/runtime error type (e.g., an IndexError or a TypeError). We show that CodeRanker can significantly increase the pass@1 accuracy of various code generation models (including Codex, GPT-Neo, GPT-J) on APPS, HumanEval and MBPP datasets.Comment: In the proceedings of Advances in Neural Information Processing Systems, 202

    Ranking LLM-Generated Loop Invariants for Program Verification

    Full text link
    Synthesizing inductive loop invariants is fundamental to automating program verification. In this work, we observe that Large Language Models (such as gpt-3.5 or gpt-4) are capable of synthesizing loop invariants for a class of programs in a 0-shot setting, yet require several samples to generate the correct invariants. This can lead to a large number of calls to a program verifier to establish an invariant. To address this issue, we propose a {\it re-ranking} approach for the generated results of LLMs. We have designed a ranker that can distinguish between correct inductive invariants and incorrect attempts based on the problem definition. The ranker is optimized as a contrastive ranker. Experimental results demonstrate that this re-ranking mechanism significantly improves the ranking of correct invariants among the generated candidates, leading to a notable reduction in the number of calls to a verifier.Comment: Findings of The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP-findings 2023
    corecore